Abstract

A kernel-based procedure for correcting experimental data for distortions
due to finite resolution and limited detector acceptance is presented.
The unfolding problem is known to be an ill-posed problem that cannot be
solved without some a priori information about the solution such as, for
example, smoothness or positivity. In the approach presented here the true
distribution is estimated by a weighted sum of kernels, with the width of
the kernels acting as a regularization parameter responsible for the
smoothness of the result. Cross-validation is used to determine an optimal
value for this parameter. A numerical example with a simulation study of
systematic and statistical errors is presented to illustrate the procedure.

Key words: unfolding, kernel, apparatus function, inverse problem, 
regularization

Tel.: +354-4608505; fax: +354-4608998
Email address: nikolai@unak.is (N.D. Gagunashvili)

Preprint submitted to Elsevier, September 19, 2012

1. Introduction

In this paper the 1-dimensional unfolding problem will be addressed. Here
the probability density function (PDF) P(x') of an experimentally measured
characteristic x' in general differs from the true physical PDF p(x) because
of the limited acceptance (probability) A(x) to register an event with true
characteristic x and the finite resolution in the response function R(x'|x),
the probability to observe x' for a given true value x. Formally the relation
between P(x') and p(x) is given by

$$P(x') \propto \int_{\Omega} p(x)\,A(x)\,R(x'|x)\,dx . \tag{1}$$

The integration in (1) is carried out over the domain Ω of the variable x. In
practical applications the experimental distribution is usually discretized by
using a histogram representation, obtained by integrating P(x') over n bins
of finite size,

$$P_j = \int_{c_{j-1}}^{c_j} P(x')\,dx' , \qquad j = 1, \ldots, n , \tag{2}$$

with $c_{j-1}$, $c_j$ the bounds of bin j.

If a parametric (theoretical) model $p_T(x, a_1, a_2, \ldots, a_k)$ for the
true PDF is known, then the unfolding can be done by determining the
parameters in a least squares fit to the binned data [1] or a maximum
likelihood fit to the unbinned data. In both cases the a priori information
needed to correct for the distortions by the experimental setup is the fit
model, which allows the true distribution to be described by a finite number
of parameter values.

Model independent unfolding to identify a physical distribution, as
considered in [1-12], is an underspecified problem, and every approach to
solving it requires a priori information about the solution. Different
methods differ, directly or indirectly, in the use of this a priori
information.

The remainder of the paper is organized as follows. In section 2 a new
method for solving the unfolding problem is presented. Properties of
the algorithm are discussed in section 3 and illustrated in section 4 by
applying it to a numerical example proposed in [1] and also used in Refs. p, 0].
Conclusions are given in section 5.

2. Description of the unfolding method 

To solve the unfolding problem (1) the following ansatz for p(x) will be
used

$$p(x) = w_0 + \sum_{i=1}^{s} w_i\,K(x, x_i, \lambda) , \tag{3}$$

where the true distribution is written as an offset $w_0$ plus a weighted
sum of s kernel functions (PDFs) $K(x, x_i, \lambda)$, i = 1, ..., s, with
non-negative weights $w_i$, central locations $x_i$ and a scale parameter λ
which determines the width of the kernel. Kernels are widely used for the
estimation of a PDF [13] as well as in non-parametric regression analysis
[14]. Note that Eq. (3) uses only kernels of one type with a common scale
parameter. The only difference between different kernels is the location of
the center. In this paper we will only consider this simplified case. In
principle the approach could be generalized to vary also the functional form
and the scale parameter.

Using Eq. (3) to parametrize the solution p(x) reduces the unfolding problem
of finding a solution in the infinite-dimensional space of all functions to
finding a solution in a finite-dimensional space. In this way a
discretization is performed which, in contrast to e.g. a discretization by a
histogram, has the advantage of introducing negligible quantization errors
for sufficiently smooth distributions.

The following discussion will focus on symmetric kernels, although,
depending on the kind of problem one attempts to solve, asymmetric kernels
may also be appropriate. The a priori information that p(x) is proportional
to a PDF is incorporated by accepting only positive weights. The scale
parameter of the kernel functions acts as a regularization parameter which
allows the smoothness of the result to be adjusted. Weights, locations and
the number of kernel functions needed to estimate p(x) will be determined by
the unfolding procedure described below.

Examples of smooth symmetric kernels, $K(x_i - x) = K(x_i + x)$, are
presented below. All kernels are PDFs which are normalized to unity when
integrating over x. For convenience, in all cases the variable
$u = (x - x_i)/\lambda$ is used. With the indicator function

$$I_{\{\cdot\}} = \begin{cases} 1 & \text{if } u \text{ satisfies the condition in the brackets} \\ 0 & \text{otherwise} \end{cases}$$

a class of polynomial kernels is defined by

$$K(u,\lambda) = \frac{N(a,b)}{\lambda}\,\left(1 - |u|^a\right)^b\, I_{\{|u|<1\}} . \tag{4}$$
Often used are the following special cases:

kernel          a    b    N(a,b)
Epanechnikov    2    1    3/4
Biweight        2    2    15/16
Triweight       2    3    35/32
Tricube         3    3    70/81
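As a cross-check of the polynomial family in Eq. (4), the following minimal Python sketch evaluates the kernels and verifies numerically that each one integrates to unity. It assumes that the constant N(a,b) multiplies the kernel, i.e. K(u, λ) = N(a,b)/λ (1 - |u|^a)^b, which for the Epanechnikov kernel requires N(2,1) = 3/4. The function names, the midpoint-rule integrator and the chosen center and width are illustrative choices, not part of the original method.

```python
def polynomial_kernel(x, center, lam, a, b, norm):
    """Eq. (4): norm / lam * (1 - |u|**a)**b for |u| < 1, zero otherwise."""
    u = (x - center) / lam
    if abs(u) >= 1.0:
        return 0.0
    return norm / lam * (1.0 - abs(u) ** a) ** b


# (a, b, N(a, b)); N is assumed to be a multiplicative constant.
POLYNOMIAL_KERNELS = {
    "Epanechnikov": (2, 1, 3 / 4),
    "Biweight": (2, 2, 15 / 16),
    "Triweight": (2, 3, 35 / 32),
    "Tricube": (3, 3, 70 / 81),
}


def integrate(f, lo, hi, steps=20000):
    """Midpoint-rule quadrature, accurate enough for a normalization check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


def normalization_table(center, lam):
    """Integral of every kernel over its support [center - lam, center + lam]."""
    table = {}
    for name, (a, b, norm) in POLYNOMIAL_KERNELS.items():
        def kernel(x, a=a, b=b, norm=norm):
            return polynomial_kernel(x, center, lam, a, b, norm)
        table[name] = integrate(kernel, center - lam, center + lam)
    return table


integrals = normalization_table(center=1.0, lam=0.175)  # illustrative values
for name, value in integrals.items():
    print(f"{name:13s} integral = {value:.6f}")
```

Each printed integral should be close to one, independent of the center and of λ.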



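Commonly employed non-polynomial kernels are:

Cosine: $K(u,\lambda) = \frac{\pi}{4\lambda}\cos\left(\frac{\pi u}{2}\right) I_{\{|u|<1\}}$ (5)

Cauchy: $K(u,\lambda) = \frac{1}{\lambda\pi}\,\frac{1}{1+u^2}$ (6)

Gaussian: $K(u,\lambda) = \frac{1}{\lambda\sqrt{2\pi}}\,e^{-u^2/2}$ (7)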
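The three kernels of Eqs. (5)-(7) can be written down directly. The sketch below is only an illustration: the function names, the midpoint integrator and the test values of center and λ are assumptions. It checks the normalization of the cosine and Gaussian kernels numerically and, because the Cauchy tails fall off slowly, compares the Cauchy mass inside ±50λ with the analytic value (2/π) arctan(50).

```python
import math


def cosine_kernel(x, center, lam):
    """Eq. (5): cosine kernel with compact support |u| < 1."""
    u = (x - center) / lam
    if abs(u) >= 1.0:
        return 0.0
    return math.pi / (4.0 * lam) * math.cos(math.pi * u / 2.0)


def cauchy_kernel(x, center, lam):
    """Eq. (6): Cauchy kernel."""
    u = (x - center) / lam
    return 1.0 / (lam * math.pi * (1.0 + u * u))


def gaussian_kernel(x, center, lam):
    """Eq. (7): Gaussian kernel."""
    u = (x - center) / lam
    return math.exp(-0.5 * u * u) / (lam * math.sqrt(2.0 * math.pi))


def midpoint_integral(f, lo, hi, steps=20000):
    """Simple midpoint-rule quadrature."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


center, lam = 1.0, 0.175  # illustrative values
cosine_norm = midpoint_integral(
    lambda x: cosine_kernel(x, center, lam), center - lam, center + lam)
gaussian_norm = midpoint_integral(
    lambda x: gaussian_kernel(x, center, lam), center - 8 * lam, center + 8 * lam)
cauchy_mass = midpoint_integral(
    lambda x: cauchy_kernel(x, center, lam), center - 50 * lam, center + 50 * lam)
cauchy_expected = 2.0 / math.pi * math.atan(50.0)
print(cosine_norm, gaussian_norm, cauchy_mass, cauchy_expected)
```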

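Also frequently used is the piecewise defined cubic B-spline

$$K(u,\lambda) = \frac{1}{3\lambda} \begin{cases} (2u+2)^3 & -1 < u \le -0.5 \\ 1 + 3(2u+1)\,\bigl(1 - 2u(2u+1)\bigr) & -0.5 < u \le 0 \\ 1 - 3(2u-1)\,\bigl(1 - 2u(2u-1)\bigr) & 0 < u \le 0.5 \\ (2-2u)^3 & 0.5 < u < 1 \end{cases} \tag{8}$$

Rewriting Eq. (3) in the form

$$p(x) = \sum_{i=0}^{s} w_i\,K_i(x) \quad \text{with} \quad K_i(x) = \begin{cases} 1 & i = 0 \\ K(x, x_i, \lambda) & i \ge 1 \end{cases} \tag{9}$$

and substituting this into the basic equation (1) yields

$$P(x') = \sum_{i=0}^{s} w_i \int_{\Omega} K_i(x)\,A(x)\,R(x'|x)\,dx . \tag{10}$$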



Taking statistical fluctuations into account, the relation between the
weights $w_i$ and the histogram of the observed distribution becomes a
linear equation

$$P = Q\,w + \epsilon , \tag{11}$$

where P is the n-component column vector of the experimentally measured
histogram, $w = (w_0, w_1, \ldots, w_s)^T$ is the (s+1)-component vector of
weights and Q is an n × (s+1) matrix with elements

$$Q_{ji} = \int_{c_{j-1}}^{c_j} dx' \int_{\Omega} K_i(x)\,A(x)\,R(x'|x)\,dx , \qquad j = 1, \ldots, n , \quad i = 0, \ldots, s . \tag{12}$$



The vector ε is an n-component vector of random residuals with expectation
value E[ε] = 0 and covariance matrix C with diagonal elements
Var[ε] = diag($\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$), where $\sigma_j$
is the statistical error of the measured distribution for the jth bin. Each
column of the matrix Q is the response of the system to the true
distribution represented by the respective kernel. Numerically, the column
vectors can be calculated by weighting the events of a Monte Carlo sample
such that they follow the distribution of the corresponding kernel, see
Ref. , and taking the histogram of the observed distribution obtained with
the weighted entries.
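The weighting procedure for one column of Q can be sketched as follows. This is a toy illustration under stated assumptions: the Monte Carlo truth is generated uniformly on [0, 2], and the detector model (acceptance 1 - (x - 1)^2/2, Gaussian smearing with σ = 0.1), the sample size, the binning and all function names are hypothetical choices made only for this example. The overall normalization of the column is left open.

```python
import bisect
import math
import random


def gaussian_kernel(x, center, lam):
    u = (x - center) / lam
    return math.exp(-0.5 * u * u) / (lam * math.sqrt(2.0 * math.pi))


def response_column(events, kernel, edges):
    """Histogram of observed values x', each event weighted by kernel(x_true)."""
    column = [0.0] * (len(edges) - 1)
    for x_true, x_obs in events:
        if edges[0] <= x_obs < edges[-1]:
            j = bisect.bisect_right(edges, x_obs) - 1
            column[j] += kernel(x_true)
    return column


def simulate_events(n_generated, rng):
    """Toy detector (assumption): uniform truth, acceptance, Gaussian smearing."""
    events = []
    for _ in range(n_generated):
        x = rng.uniform(0.0, 2.0)
        if rng.random() < 1.0 - 0.5 * (x - 1.0) ** 2:   # accept the event
            events.append((x, rng.gauss(x, 0.1)))        # smear the accepted x
    return events


rng = random.Random(12345)
events = simulate_events(20000, rng)
edges = [0.1 * j for j in range(21)]          # 20 bins on [0, 2]
column = response_column(
    events, lambda x: gaussian_kernel(x, 1.0, 0.175), edges)
```

Repeating the last call for every candidate kernel, plus once with a constant weight for K_0, yields the full matrix Q from a single Monte Carlo sample.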

For a given set of kernels the weights w in Eq. (11) can be determined by
a linear least squares fit. In order to have a model that is as flexible as
possible, the candidate kernels could in principle have a continuous range
of central positions. In practical applications it will usually be
sufficient to consider a discrete set with a spacing significantly smaller
than the bandwidth λ. The goal then is to find a subset of kernels for the
final fit which provides a good description of the data and where all
weights are positive and significantly different from zero. This at the
same time stabilizes the solution and guarantees positiveness.



To find such an optimal subset, a forward stepwise algorithm [15] is used.
It requires a criterion for the quality of the fit, which is taken to be the
test statistic $\chi^2_l$,

$$\chi^2_l = (P - Q w)^T C^{-1} (P - Q w) , \tag{13}$$

where the index l denotes the number of weights in the fit and w is
determined such that it minimizes $\chi^2_l$. The solution $\hat{w}$ and its
covariance matrix $C_w$ are given by the well known expressions

$$\hat{w} = (Q^T C^{-1} Q)^{-1} (Q^T C^{-1}) P \quad \text{and} \quad C_w = (Q^T C^{-1} Q)^{-1} . \tag{14}$$

If the underlying distribution of the measured histogram P can be described
by a linear combination of the columns of Q, then the $\chi^2_l$ statistic
follows a $\chi^2$-distribution with n - l degrees of freedom.
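Equations (13) and (14) translate directly into a few lines of linear algebra. The sketch below assumes a diagonal covariance matrix C = diag(σ_j²) and uses NumPy. The function name and the small noise-free toy data set are illustrative and not taken from the paper.

```python
import numpy as np


def weighted_least_squares(Q, P, sigma):
    """Eqs. (13)-(14) for a diagonal covariance C = diag(sigma**2)."""
    c_inv = np.diag(1.0 / sigma ** 2)
    cov_w = np.linalg.inv(Q.T @ c_inv @ Q)      # C_w
    w = cov_w @ Q.T @ c_inv @ P                 # fitted weights
    residual = P - Q @ w
    chi2 = float(residual @ c_inv @ residual)   # Eq. (13)
    return w, cov_w, chi2


# Toy example: four "bins", two columns, data exactly in the span of Q.
Q = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
w_true = np.array([2.0, 0.5])
sigma = np.array([0.1, 0.2, 0.1, 0.2])
P = Q @ w_true
w_hat, cov_w, chi2 = weighted_least_squares(Q, P, sigma)
print(w_hat, chi2)
```

For badly conditioned Q a solver based on a QR or singular value decomposition, such as `np.linalg.lstsq` applied to the rows scaled by 1/σ_j, is numerically safer than the explicit inverse used here for clarity.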

Now assume a total of s candidate kernel functions $K_i(x)$, i = 1, ..., s,
with centers evenly spaced along the possible values x of the true
distribution. In a first step the weight $w_0$ is determined by fitting only
the constant function $K_0$ to the data. Then an iterative procedure starts
with alternating "Forward" and "Backward" steps described below.

Given a fit model consisting of l kernels, in the next Forward step each
of the other s - l kernels is tried for inclusion into the model. From all
combinations, the one is selected where all weights are positive and which
gives the largest reduction in $\chi^2_l$. If no such fit is found then the
procedure stops. Otherwise the new kernel is included into the model if

$$\frac{\chi^2_l - \chi^2_{l+1}}{\chi^2_{l+1}}\,(n - l - 1) > F_{in} , \tag{15}$$

i.e. if the reduction in $\chi^2$ is sufficiently large. The procedure also
stops if the best fit does not satisfy Eq. (15). After accepting a new
kernel into the model a Backward step is performed. Here each of the
previously included kernels is in turn removed from the model and the test
model is fitted to the data. From all fits which have only positive weights
the one with the smallest increase in $\chi^2$ is taken. If the increase is
below a certain threshold,

$$\frac{\chi^2_{l-1} - \chi^2_l}{\chi^2_l}\,(n - l) < F_{out} , \tag{16}$$

the respective kernel is removed from the model, and the Backward step is
iterated with the reduced model. If no kernel is removed then again a
Forward step is tried. The procedure stops if neither a Forward nor a
Backward step can be done.
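The alternating Forward and Backward steps can be sketched in Python as follows. This is a simplified illustration, not the reference implementation: it assumes a diagonal covariance matrix, always keeps the constant K_0 (column 0) in the model, and adds a `max_steps` guard against cycling. The function names and the toy matrix Q, built from narrow Gaussian columns without detector effects, are hypothetical.

```python
import numpy as np


def fit(Q, P, sigma, cols):
    """Weighted least-squares fit using only the columns listed in cols."""
    A = Q[:, cols] / sigma[:, None]
    b = P / sigma
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w, float(np.sum((b - A @ w) ** 2))


def f_statistic(chi2_small, chi2_large, dof):
    """Relative chi2 change between the smaller and the larger model."""
    return (chi2_small - chi2_large) / max(chi2_large, 1e-300) * dof


def stepwise_select(Q, P, sigma, f_in=1e-4, f_out=1e-4, max_steps=1000):
    n, n_cols = Q.shape
    model = [0]                                   # start with the constant K_0
    _, chi2 = fit(Q, P, sigma, model)
    for _ in range(max_steps):
        # Forward step: best all-positive fit with one additional kernel.
        trials = []
        for c in range(1, n_cols):
            if c not in model:
                w, chi2_new = fit(Q, P, sigma, model + [c])
                if np.all(w > 0):
                    trials.append((chi2_new, c))
        if not trials:
            break
        chi2_new, c = min(trials)
        if f_statistic(chi2, chi2_new, n - len(model) - 1) <= f_in:
            break                                 # Eq. (15) not satisfied
        model.append(c)
        chi2 = chi2_new
        # Backward steps: drop kernels whose removal barely increases chi2.
        while len(model) > 1:
            trials = []
            for c in model[1:]:
                reduced = [k for k in model if k != c]
                w, chi2_red = fit(Q, P, sigma, reduced)
                if np.all(w > 0):
                    trials.append((chi2_red, c))
            if not trials:
                break
            chi2_red, c = min(trials)
            if f_statistic(chi2_red, chi2, n - len(model)) >= f_out:
                break                             # Eq. (16) not satisfied
            model.remove(c)
            chi2 = chi2_red
    w, chi2 = fit(Q, P, sigma, model)
    return model, w, chi2


# Toy usage: 40 bins, a constant plus nine narrow Gaussian candidate columns.
rng = np.random.default_rng(1)
x = np.linspace(0.025, 1.975, 40)
centers = 0.2 * np.arange(1, 10)
lam = 0.05
kernels = np.exp(-0.5 * ((x[:, None] - centers) / lam) ** 2) / (lam * np.sqrt(2 * np.pi))
Q = np.column_stack([np.ones_like(x), kernels])
w_true = np.zeros(10)
w_true[[0, 3, 7]] = [1.0, 5.0, 3.0]
sigma = np.full(40, 0.05)
P = Q @ w_true + rng.normal(0.0, sigma)
model, w_hat, chi2 = stepwise_select(Q, P, sigma)
print(model, w_hat, chi2)
```

With the very small thresholds used here, additional kernels may enter the model as long as their fitted weights are positive, which mirrors the remark that small thresholds allow as many kernels as possible.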

For the stepwise method defined above, appropriate thresholds $F_{in}$ and
$F_{out}$ must be chosen. Usually one uses $F_{in} = F_{out} = F_0$. There
is no common opinion about the best value for this constant. Reference [16],
for example, used $F_0 = 2.5$, while the authors of Ref. [17] used
$F_0 = 3.29$ for the same sample of data. To allow the inclusion of as many
kernels as possible into the model, very small values of $F_0$ can be used.

When the method stops, an estimate $\hat{p}(x)$ has been found, defined by
the locations $x_i$, i = 1, ..., k of a set of kernel functions which are
summed with weights $\hat{w}_i$, i = 0, ..., k to yield

$$\hat{p}(x) = \sum_{i=0}^{k} \hat{w}_i\,K_i(x) . \tag{17}$$

The error band around $\hat{p}(x)$ is given by
$\pm\sqrt{\mathrm{var}[\hat{p}(x)]}$, obtained by setting x = y in the
expression for the covariance between any two points x and y

$$\mathrm{cov}[\hat{p}(x), \hat{p}(y)] = \sum_{i,j=0}^{k} K_i(x)\,(C_w)_{ij}\,K_j(y) . \tag{18}$$

A histogram representation $\hat{\mathbf{p}}$ of the unfolded distribution
with m bins, obtained by integrating $\hat{p}(x)$ over the x-intervals
$[b_{j-1}, b_j]$, j = 1, ..., m, is given by

$$\hat{\mathbf{p}} = K\,\hat{w} , \tag{19}$$

where K is an m × (k+1) matrix with elements

$$K_{ji} = \int_{b_{j-1}}^{b_j} K_i(x)\,dx . \tag{20}$$

The covariance matrix of $\hat{\mathbf{p}}$ is given by

$$C_p = K\,C_w\,K^T . \tag{21}$$

Note that this matrix is singular when the number of weights is smaller than
the number of bins in the histogram of the unfolded distribution.
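For Gaussian kernels the bin integrals of Eq. (20) are differences of the normal distribution function, so the histogram representation and its covariance can be sketched in a few lines. The kernel positions, weights and the diagonal C_w below are made-up illustrative numbers, and the function names are hypothetical. The last lines illustrate the remark about singularity: with four weights and twelve bins the rank of C_p cannot exceed four.

```python
import math
import numpy as np


def gaussian_cdf(x, center, lam):
    return 0.5 * (1.0 + math.erf((x - center) / (lam * math.sqrt(2.0))))


def bin_matrix(edges, centers, lam):
    """Eq. (20): integrals of K_0 = 1 and of the Gaussian kernels over each bin."""
    m = len(edges) - 1
    K = np.empty((m, len(centers) + 1))
    for j in range(m):
        K[j, 0] = edges[j + 1] - edges[j]              # integral of the constant
        for i, c in enumerate(centers, start=1):
            K[j, i] = gaussian_cdf(edges[j + 1], c, lam) - gaussian_cdf(edges[j], c, lam)
    return K


edges = np.linspace(0.0, 2.0, 13)                      # m = 12 bins
centers = [0.5, 1.0, 1.5]                              # hypothetical kernel positions
K = bin_matrix(edges, centers, lam=0.175)
w = np.array([10.0, 200.0, 500.0, 300.0])              # hypothetical weights
C_w = np.diag([4.0, 100.0, 150.0, 120.0])              # hypothetical covariance

p_hist = K @ w                                         # Eq. (19)
C_p = K @ C_w @ K.T                                    # Eq. (21)
rank = np.linalg.matrix_rank(C_p)
print(p_hist.round(1), rank)
```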

3. Discussion 

The unfolding algorithm described above defines a generic approach to
represent measured information about a true physical distribution in a
compact way. The fact that the model is specified with proper statistical
errors allows a quantitative comparison between an independent theoretical
model and the unfolding result when working on the subspace spanned by the
model $\hat{p}(x)$. To test the hypothesis that the underlying distribution
of the unfolding result has the shape $p_T(x)$, one can use the histogram
representation $\mathbf{p}_T$ of $p_T(x)$ with the same binning as for
$\hat{\mathbf{p}}$. In case of a non-singular covariance matrix $C_p$ a
$\chi^2$-test can be applied directly on the binned distributions. If the
number of bins for the unfolded distribution is larger than the number of
weights, the comparison can still be done in the space spanned by the
weights. In this case the weight vector $w_T$ for expanding $\mathbf{p}_T$
into the kernels K is given by

$$w_T = (K^T K)^{-1} K^T \mathbf{p}_T , \tag{22}$$

which, in analogy to Eq. (14), is simply the unweighted fit of the kernel
functions used to describe the model to the theoretical prediction
$\mathbf{p}_T$. If $p_T(x)$ is indeed the underlying distribution of the
unfolding result, then the test statistic

$$\chi^2 = (\hat{w} - w_T)^T\,C_w^{-1}\,(\hat{w} - w_T) \tag{23}$$

has a $\chi^2$-distribution with k + 1 degrees of freedom, the rank of the
matrix $C_w$. It has to be emphasized that the above test constitutes only a
necessary condition for a theoretical prediction to describe the data. It is
not a sufficient one, as examples can be constructed where additional
kernels would be needed to properly model the prediction, which may be known
to be absent in the data and thus are ignored in the test. In practical
applications one should therefore also make sure that $K w_T$ provides a
good model for $p_T(x)$.
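The comparison in weight space, Eqs. (22) and (23), can be sketched as below. The matrix K, the weights and the covariance are small made-up numbers chosen so that the result can be verified by hand, and the function name is hypothetical. Converting the statistic into a p-value would additionally require the χ² distribution function, which is left out here.

```python
import numpy as np


def compare_with_prediction(K, p_T, w_hat, cov_w):
    """Expand the binned prediction into the kernels and test in weight space."""
    w_T, *_ = np.linalg.lstsq(K, p_T, rcond=None)        # Eq. (22), unweighted fit
    diff = w_hat - w_T
    chi2 = float(diff @ np.linalg.solve(cov_w, diff))    # Eq. (23)
    return w_T, chi2, len(w_hat)                         # dof = k + 1


# Toy numbers: 6 bins, 3 weights (all values are illustrative).
t = np.linspace(0.0, 1.0, 6)
K = np.column_stack([np.ones(6), t, t ** 2])
w_hat = np.array([1.0, 2.0, 3.0])
cov_w = np.diag([0.01, 0.04, 0.09])
p_T = K @ np.array([1.1, 2.0, 3.0])                      # prediction in the span of K
w_T, chi2, dof = compare_with_prediction(K, p_T, w_hat, cov_w)
print(w_T, chi2, dof)
```

In this toy case the prediction differs from the unfolded weights by 0.1 in the first component only, so the statistic equals (0.1)²/0.01 = 1 for three degrees of freedom.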

In principle any smooth kernel can be used, and in practice results do
not vary significantly when switching between the functions discussed before.
The choice of the optimal type of kernel function and the value of the scale
parameter λ for a given problem is driven by the quality of the fit. Common
tools to assess the fit quality in regression analysis are:

1. p-value of the fit

2. analysis of the normalized residuals of the data

(a) as a function of the estimated value $\hat{P}$

(b) as a function of the observed value x'

3. Q-Q plot: quantile of normalized residuals versus the theoretical
quantile expected from a standard normal N(0, 1) distribution
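The residual-based diagnostics in the list above can be prepared with the Python standard library alone. In the following sketch the plotting positions (k + 0.5)/n for the theoretical quantiles are one common convention and an assumption of this example. The function names and the three-bin toy input are illustrative.

```python
from statistics import NormalDist


def normalized_residuals(observed, fitted, sigma):
    """(P_j - fitted P_j) / sigma_j for every bin of the measured histogram."""
    return [(o - f) / s for o, f, s in zip(observed, fitted, sigma)]


def qq_points(residuals):
    """Pairs (theoretical N(0,1) quantile, ordered residual) for a Q-Q plot."""
    n = len(residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((k + 0.5) / n) for k in range(n)]
    return list(zip(theoretical, sorted(residuals)))


residuals = normalized_residuals([12.0, 9.0, 10.0], [10.0, 10.0, 10.0], [2.0, 1.0, 1.0])
points = qq_points(residuals)
print(residuals)
print(points)
```

For a good fit the points scatter around the diagonal; plotting the same residuals against the fitted bin contents and against x' gives the two residual plots of item 2.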

The positions of the kernels considered in the algorithm should cover the
entire allowed range of x with a spacing significantly smaller than the width
given by the scale parameter λ. In order to avoid loss of information due
to binning, the number of bins for the measured histogram P should be as
large as possible. However, in order to have meaningful error estimates for
the least squares fits that determine w, the number of entries in a single
bin should not be less than about 25.

An issue left open in the definition of the unfolding algorithm is the
determination of the scale parameter λ of the kernel functions. Evidently,
larger values will in general result in a smoother estimate for the true
distribution but may lead to bad fits of the observed distribution when
narrow features cannot be accommodated. Too small values of λ, on the other
hand, will favor overfitting of statistical fluctuations in the data. In
general one will therefore try a range of values for λ and, in order to find
some optimal balance between smoothness of the result and overfitting of the
data, select a parameter in the region just below the largest value which
provides a satisfactory fit to the data. In the literature [21-24] the use
of cross-validation or bootstrap methods is suggested to find the optimal
solution. Here we will use a simple leave-one-out cross-validation approach
[24] to determine the best value for λ.

Finally it should be noted that the unfolding method described above 
does not take into account uncertainties in the matrix Q which relates the 
weight vector w to the measurements P. Therefore, when Q is determined 
by means of a Monte Carlo simulation the Monte Carlo sample should be 
significantly larger than the data sample. 

4. A numerical example 

The method described above is now illustrated with an example proposed by
Blobel [1] and for illustration also used elsewhere 0, 0]. The true
distribution, defined on the range x ∈ [0, 2], is described by a sum of three
Breit-Wigner functions

$$p(x) \propto \frac{4}{(x-0.4)^2 + 4} + \frac{0.4}{(x-0.8)^2 + 0.04} + \frac{0.2}{(x-1.5)^2 + 0.04} , \tag{24}$$

from which the experimentally measured distribution is obtained by

$$P(x') \propto \int_0^2 p(x)\,A(x)\,R(x'|x)\,dx , \tag{25}$$

with an acceptance function A(x)

$$A(x) = 1 - \frac{(x-1)^2}{2} \tag{26}$$

and a resolution function describing a biased measurement with Gaussian
smearing

$$R(x'|x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x' - x + 0.05x^2)^2}{2\sigma^2}\right) \quad \text{with} \quad \sigma = 0.1 . \tag{27}$$
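The ingredients of the example, Eqs. (24)-(27), are simple enough to code directly. The sketch below evaluates the unnormalized folded distribution of Eq. (25) with a midpoint rule. The function names and the number of integration steps are choices made for this illustration only.

```python
import math

SIGMA = 0.1  # resolution parameter of Eq. (27)


def true_shape(x):
    """Unnormalized true distribution, Eq. (24)."""
    return (4.0 / ((x - 0.4) ** 2 + 4.0)
            + 0.4 / ((x - 0.8) ** 2 + 0.04)
            + 0.2 / ((x - 1.5) ** 2 + 0.04))


def acceptance(x):
    """Acceptance function, Eq. (26)."""
    return 1.0 - 0.5 * (x - 1.0) ** 2


def resolution(x_obs, x):
    """Biased Gaussian resolution function, Eq. (27)."""
    shift = x_obs - x + 0.05 * x * x
    return math.exp(-0.5 * (shift / SIGMA) ** 2) / (math.sqrt(2.0 * math.pi) * SIGMA)


def folded(x_obs, steps=2000):
    """Unnormalized measured distribution, Eq. (25), by midpoint integration."""
    h = 2.0 / steps
    return h * sum(
        true_shape(x) * acceptance(x) * resolution(x_obs, x)
        for x in ((k + 0.5) * h for k in range(steps)))


for x_obs in (0.4, 0.77, 1.1, 1.4):
    print(f"x' = {x_obs:4.2f}   P(x') proportional to {folded(x_obs):.3f}")
```

The peak of the true distribution at x = 0.8 appears in the folded distribution slightly below 0.8, because the measurement is biased towards smaller values by 0.05x².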



The acceptance and resolution functions are shown in Fig. 1. Also shown is
an example for the measured distribution obtained by simulating a sample
of N = 5000 events.

For the determination of the matrix Q a sample of 500 000 Monte Carlo
events was simulated. The true distribution was taken uniform and the kernel
responses were calculated by weighting the Monte Carlo events with weights
proportional to the value of the kernel function. A set of 100 Gaussian
kernels was used with positions uniformly distributed over the interval
[0, 2]. For the nominal analysis a scale parameter λ = 0.175 and a threshold
value $F_0 = 10^{-4}$ in the stepwise algorithm were chosen.

Figure 1: The acceptance function A(x) and resolution function R(x'|x) for
x = 0.5, 1.0 and 1.5 (left) and histogram of the measured distribution P
based on a sample of 5000 events generated for the true distribution
(right). The true distribution p(x) is shown by the curve.

The estimate for the true distribution obtained by the unfolding method
described above is represented by a constant plus a weighted sum of seven
kernels. The positions and weights of the kernels determined by the stepwise
algorithm together with the errors and correlation matrix of the weights are
listed in Tab. 1. The quality of the unfolding result is illustrated by
Fig. 2. It shows how the superposition of the folded kernels approximates
the measured distribution, together with the analysis of the residuals and
the quantile-quantile plot. No structure is observed in either of the
control plots. The p-value from the test for the comparison of the histogram
of the measured distribution P and the fitted histogram $\hat{P}$,
Fig. 2(a), is p = 0.23.

Table 1 gives the results for a scale parameter λ = 0.175 of the Gaussian
kernels. To illustrate the effect of this parameter, Fig. 3 shows how the
unfolding result varies with λ. The components of the unfolding results
are shown together with the estimate $\hat{p}(x)$. Also shown are the error
bands $\pm 2\sqrt{\mathrm{var}[\hat{p}(x)]}$ compared to the true
distribution p(x). Figure 4 illustrates for an even larger range of λ how
the fit quality varies with the scale parameter. One clearly sees that large
values λ > 0.2 lead to a bad fit with a p-value p < 0.05. Here also
significant structures in the residuals and in the quantile-quantile plots
are observed. The smallest value shows some indication of overfitting. The
best parameters for this example evidently are in the range
0.15 < λ < 0.20.

Figure 2: Illustration of the quality of the unfolding result. (a) Folded
kernels of the estimate of the true distribution compared to the measured
distribution; (b) normalized residuals of the fit as a function of
$\hat{P}$; (c) normalized residuals as a function of x'; (d)
quantile-quantile plot for the normalized residuals.

Table 1: Positions of kernels $x_i$, weights $w_i$, errors of weights
$\delta_i$ and correlation matrix for the weights determined by the
unfolding algorithm.

                                 correlation with weight
 i   x_i     w_i      δ_i       0      1      2      3      4      5      6
 0    -    1456.3    268.4
 1   0.33   122.8    282.6   -0.70
 2   0.77  1111.4   1691.1   -0.51   0.81
 3   1.18    79.1    919.4    0.54  -0.65  -0.86
 4   1.11   137.3   1139.7   -0.55   0.68   0.90  -0.99
 5   0.82   891.3   1816.6    0.49  -0.78  -0.99   0.88  -0.92
 6   0.43    85.1    350.5    0.53  -0.96  -0.88   0.66  -0.70   0.85
 7   1.50  1029.1    164.4   -0.82   0.68   0.66  -0.82   0.80  -0.66  -0.58

This is confirmed when doing a simple leave-one-out cross-validation,
removing in turn each bin of the measured distribution and calculating the
predicted residual sum of squares [24] as a function of λ,

$$\chi^2_{pr} = \sum_{i=1}^{n} \frac{\bigl(P_i - \hat{P}_{(i)}\bigr)^2}{\sigma_i^2} . \tag{28}$$

Here $\hat{P}_{(i)}$ is the estimator for the content of the ith bin of the
observed distribution P, calculated by excluding this bin from the unfolding
procedure or from the determination of the weights for the kernels selected
by the unfolding procedure. The results of the calculation of
$\chi^2_{pr}/n$ for different scale parameters λ are given in Tab. 2. The
minimal value of $\chi^2_{pr}/n = 1.29$ is achieved for λ = 0.2. The choice
of λ = 0.175 with $\chi^2_{pr}/n = 1.32$ gives a solution with about 20 %
larger statistical errors than for λ = 0.2 but, as will be discussed in more
detail below, has a lower bias. The solutions with λ < 0.15 can be
considered as overfitting the data while λ > 0.20 underfits them.
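The leave-one-out quantity of Eq. (28) can be sketched for the simpler of the two variants mentioned above, in which the set of kernels is kept fixed and only the weights are refitted without bin i. The sketch assumes a diagonal covariance matrix. The function name and the toy polynomial columns standing in for Q are illustrative assumptions.

```python
import numpy as np


def press_per_bin(Q, P, sigma):
    """Average predicted residual sum of squares, chi2_pr / n, of Eq. (28)."""
    n = len(P)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i                       # leave bin i out
        A = Q[keep] / sigma[keep, None]
        w, *_ = np.linalg.lstsq(A, P[keep] / sigma[keep], rcond=None)
        prediction = Q[i] @ w                          # estimate for the left-out bin
        total += ((P[i] - prediction) / sigma[i]) ** 2
    return total / n


# Toy data: 30 bins described by three smooth columns plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0, 30)
Q = np.column_stack([np.ones_like(x), x, x ** 2])
sigma = np.full(30, 0.5)
P = Q @ np.array([3.0, 2.0, 1.0]) + rng.normal(0.0, sigma)
value = press_per_bin(Q, P, sigma)
print(value)
```

Scanning λ then amounts to rebuilding Q for each value, repeating the kernel selection if desired, and comparing the resulting values of χ²_pr/n.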

Table 2: The p-values and average predicted residual sum of squares
$\chi^2_{pr}/n$ for different values of the scale parameter λ.

λ              0.1    0.125  0.15   0.175  0.2    0.225  0.25   0.275
p-value        0.21   0.22   0.20   0.23   0.21   0.05   0.01   0.00
χ²_pr/n        1.77   1.42   1.46   1.32   1.29   1.58   1.84   2.02

To investigate the statistical properties of the unfolding procedure,
M = 1000 simulation runs were performed producing statistically independent
measured histograms, each based on N = 5000 events for the same true
distribution (24). The unfolded distribution was calculated for each
measured distribution. For the comparison between the unfolding results and
the true distribution a histogram representation is used with m = 12 and
alternatively m = 40 bins. Bin contents are normalized to the bin width in
order to make the bin contents independent of the binning. The following
quantities are considered for each bin i of the unfolded distribution, where
$\hat{p}_i^{(j)}$ and $\delta_i^{(j)}$ denote the unfolded value and its
estimated error in run j, and $\Delta x_i = x_i - x_{i-1}$ is the bin width.

• $p_i$: exact value of bin i of the true distribution

$$p_i = \frac{N}{\Delta x_i} \int_{x_{i-1}}^{x_i} p(x)\,dx$$

• $\bar{p}_i$: run-averaged value of bin i of the unfolded distribution

$$\bar{p}_i = \frac{1}{M} \sum_{j=1}^{M} \hat{p}_i^{(j)}$$

• $B[\hat{p}_i]$: bias in bin i of the unfolded distribution

$$B[\hat{p}_i] = \bar{p}_i - p_i$$

• $s_i$: run-averaged standard deviation for bin i

$$s_i = \sqrt{\frac{1}{M} \sum_{j=1}^{M} \bigl(\hat{p}_i^{(j)} - \bar{p}_i\bigr)^2}$$

• $\bar{\delta}_i$: run-averaged error estimate for bin i

$$\bar{\delta}_i = \frac{1}{M} \sum_{j=1}^{M} \delta_i^{(j)}$$

• $B[\delta_i]$: bias on the error of bin i

$$B[\delta_i] = \bar{\delta}_i - s_i$$

• $\mathrm{RMSE}_i$: run-averaged Root Mean Square Error for bin i

$$\mathrm{RMSE}_i = \sqrt{\frac{1}{M} \sum_{j=1}^{M} \bigl(\hat{p}_i^{(j)} - p_i\bigr)^2} = \sqrt{s_i^2 + B[\hat{p}_i]^2}$$

Figure 3: Components of the unfolded distribution and the unfolded
distribution $\hat{p}(x)$ given by the sum of the components with
$\pm 2\delta(x)$ interval (left) and the error band overlaid with the true
distribution p(x) (right) for different values of the scale parameter λ.

In addition to the bin-dependent quantities some global measures for the
quality of the unfolding result are defined by summing over all m bins of the
unfolded distribution.

• TRMSB: Total Root Mean Square Bias

$$\mathrm{TRMSB} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} B[\hat{p}_i]^2}$$

• TRMSV: Total Root Mean Square Variance

$$\mathrm{TRMSV} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} s_i^2}$$

• TRMSE: Total Root Mean Square Error

$$\mathrm{TRMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \mathrm{RMSE}_i^2} = \sqrt{\mathrm{TRMSB}^2 + \mathrm{TRMSV}^2}$$
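The per-bin and global quantities defined above can be computed from the stored run results as sketched below. The sketch assumes that the standard deviation is normalized by 1/M, so that the relation RMSE_i² = s_i² + B_i² holds exactly. The function names, the dictionary keys and the two-run toy input are illustrative only.

```python
import math


def bin_statistics(runs, errors, truth):
    """Per-bin bias, spread, mean error estimate and RMSE over M runs.

    runs[j][i]   unfolded value of bin i in run j
    errors[j][i] estimated error of bin i in run j
    truth[i]     exact value of bin i
    """
    M = len(runs)
    stats = []
    for i, p_true in enumerate(truth):
        values = [run[i] for run in runs]
        mean = sum(values) / M
        s = math.sqrt(sum((v - mean) ** 2 for v in values) / M)
        mean_error = sum(err[i] for err in errors) / M
        rmse = math.sqrt(sum((v - p_true) ** 2 for v in values) / M)
        stats.append({"mean": mean, "bias": mean - p_true, "s": s,
                      "mean_error": mean_error, "error_bias": mean_error - s,
                      "rmse": rmse})
    return stats


def global_measures(stats):
    """TRMSB, TRMSV and TRMSE summed over all bins."""
    m = len(stats)
    trmsb = math.sqrt(sum(b["bias"] ** 2 for b in stats) / m)
    trmsv = math.sqrt(sum(b["s"] ** 2 for b in stats) / m)
    trmse = math.sqrt(sum(b["rmse"] ** 2 for b in stats) / m)
    return trmsb, trmsv, trmse


# Two toy runs with two bins each.
stats = bin_statistics(runs=[[10.0, 21.0], [12.0, 19.0]],
                       errors=[[1.0, 1.0], [1.0, 1.0]],
                       truth=[10.0, 20.0])
trmsb, trmsv, trmse = global_measures(stats)
print(stats[0], trmsb, trmsv, trmse)
```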



Numerical calculations of the characteristics of the unfolding procedure
for 12 bins and Gaussian kernels with λ = 0.175 are presented in Tab. 3. One
sees that the bias is small compared to the statistical errors of the
unfolding result and that the error estimates agree well with the actual
scatter of the results. A visual representation of these findings for
different values of λ is given in Fig. 5 for m = 12 and m = 40 bins of the
unfolded distribution. At the resolution of 12 bins the unfolding result is
consistent with the true distribution; at 40 bins and λ = 0.2 one observes
some systematic effects in the bias distributions. The bias gets smaller
with decreasing values of λ, but the errors become larger, which illustrates
the well known "bias-noise complementary law" j^] that the noise grows when
the regularization parameter tends to zero.

The behavior of the global characteristics using 12 or 40 bins for the
unfolded distribution is shown in Fig. 6. The behavior in both cases is very
similar. The plots show how with increasing scale parameter λ, i.e. stronger
regularization, statistical errors decrease while the bias increases. Adding
both contributions in quadrature, the Total Root Mean Square Error shows
a minimum around λ = 0.175, i.e. in the region also favored by the
cross-validation approach for the determination of λ.

Table 3: Exact values of the bins of the true distribution $p_i$, average
values $\bar{p}_i$ from the unfolding procedure, bias $B[\hat{p}_i]$,
standard deviation $s_i$, mean error $\bar{\delta}_i$, bias of the
calculated errors $B[\delta_i]$ and Root Mean Square Errors
$\mathrm{RMSE}_i$ for λ = 0.175.

 i    p_i    mean p_i   B[p_i]    s_i   mean δ_i   B[δ_i]   RMSE_i
 1    913.     900.      -13.    119.     123.       4.      120.
 2   1152.    1146.       -5.    121.     123.       2.      121.
 3   1631.    1570.      -61.    123.     129.       6.      137.
 4   2760.    2813.       53.    152.     156.       5.      161.
 5   4941.    4793.     -149.    167.     184.      17.      223.
 6   5011.    4981.      -30.    177.     190.      13.      180.
 7   3018.    3044.       26.    157.     164.       7.      159.
 8   2284.    2119.     -165.    146.     157.      11.      220.
 9   2718.    2797.       79.    159.     167.       8.      177.
10   3073.    2948.     -125.    144.     165.      21.      191.
11   1779.    1798.       19.    136.     150.      14.      138.
12    997.     983.      -14.    131.     127.      -4.      132.

5. Conclusions

A new method for unfolding the true distribution from experimental data
is presented. The unfolding problem is known to be an ill-posed problem
which cannot be solved without some a priori information about the solution.
Smoothness and positiveness are examples of this type of information. In the
proposed algorithm the unknown true distribution is represented as a
weighted sum of smooth kernels. The scale parameter of the kernels acts as a
regularization parameter which allows the smoothness of the result to be
adjusted. A cross-validation approach is proposed to determine an optimal
value of this parameter. The method avoids the discretization of the
integral equation which is often done by unfolding methods and is an
additional source of bias for the solution of the unfolding problem. Various
criteria were discussed to gauge the quality of the unfolding result. The
method provides a solution for the unfolding problem with a non-singular
error matrix which can be used to test the consistency of a theoretical
prediction with the experimental data. A numerical example including
extensive simulation studies of the statistical properties of the method was
presented to illustrate and to validate the procedure. For the example,
typical execution times per unfolding were found to be around 0.1 s on a
2 GHz CPU. The method can be extended to deal with steeply falling spectra
or multidimensional distributions and to handle properly the case of limited
statistics in the determination of the response function.

Figure 6: Global characteristics of the unfolding result for 12 bins (left)
and 40 bins (right) as a function of the scale parameter λ.